{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Creating tables in SQL\n",
    "---------------------\n",
    "\n",
    "Before we actually get into basic SQL queries (**asking questions _of_ data in tables**), we'll look at some of the basics about how to **create** tables.\n",
    "\n",
    "**NOTE: Make sure to have a copy of the database file, \"dataset_1.db\", from the last lecture downloaded and in this directory for the below to work!**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/tarabalakrishnan/.local/lib/python2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated. You should import from traitlets.config instead.\n",
      "  \"You should import from traitlets.config instead.\", ShimWarning)\n",
      "/Users/tarabalakrishnan/.local/lib/python2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.\n",
      "  warn(\"IPython.utils.traitlets has moved to a top-level traitlets package.\")\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "u'Connected: None@dataset_1.db'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%load_ext sql\n",
    "%sql sqlite:///dataset_1.db"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Activity 2-1:\n",
    "------------\n",
    "\n",
    "Schemas & table creation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that the database we just loaded has one table, `precipitation_full`, having the following schema:\n",
    "\n",
    "> * `state_code`\n",
    "> * `station_id`\n",
    "> * `year`\n",
    "> * `month`\n",
    "> * `day`\n",
    "> * `hour`\n",
    "> * `precipitation`\n",
    "> * `flag_1`\n",
    "> * `flag_2`\n",
    "\n",
    "Each tuple in this table describes one hour of rainfall (`precipitation`- in hundredths of an inch) at one station (`station_id`) in one state (`state_code`).  Note that tuples with `hour=25` record the total rainfall for that day, and that we can ignore the values of attributes `flag_1` and `flag_2` for now."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, however, let's see how to view the **schema** of existing tables on your own; there are several ways, including but not limited to:\n",
    "* DESCRIBE tablename\n",
    "* SHOW CREATE TABLE tablename\n",
    "* SHOW COLUMNS tablename\n",
    "\n",
    "Unfortunately, support for these varies widely between DBMSs, and is also limited by our IPython interface (for example sqlite, which we are using, does not support the above; it does have a `.schema tablename` command, however this doesn't work in IPython notebooks...)\n",
    "\n",
    "One that does work for us here though is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>cid</th>\n",
       "        <th>name</th>\n",
       "        <th>type</th>\n",
       "        <th>notnull</th>\n",
       "        <th>dflt_value</th>\n",
       "        <th>pk</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>0</td>\n",
       "        <td>state_code</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>1</td>\n",
       "        <td>station_id</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>2</td>\n",
       "        <td>year</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>3</td>\n",
       "        <td>month</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>4</td>\n",
       "        <td>day</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>5</td>\n",
       "        <td>hour</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>6</td>\n",
       "        <td>precipitation</td>\n",
       "        <td>INT</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>7</td>\n",
       "        <td>flag_1</td>\n",
       "        <td>VARCHAR(1)</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>8</td>\n",
       "        <td>flag_2</td>\n",
       "        <td>VARCHAR(1)</td>\n",
       "        <td>0</td>\n",
       "        <td>None</td>\n",
       "        <td>0</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[(0, u'state_code', u'INT', 0, None, 0),\n",
       " (1, u'station_id', u'INT', 0, None, 0),\n",
       " (2, u'year', u'INT', 0, None, 0),\n",
       " (3, u'month', u'INT', 0, None, 0),\n",
       " (4, u'day', u'INT', 0, None, 0),\n",
       " (5, u'hour', u'INT', 0, None, 0),\n",
       " (6, u'precipitation', u'INT', 0, None, 0),\n",
       " (7, u'flag_1', u'VARCHAR(1)', 0, None, 0),\n",
       " (8, u'flag_2', u'VARCHAR(1)', 0, None, 0)]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%sql PRAGMA table_info(precipitation_full);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A bit verbose, but gets the job done!\n",
    "\n",
    "And, we can get the exact statement used to create the table as follows (**a great way to find guidance here!!**):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Done.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table>\n",
       "    <tr>\n",
       "        <th>sql</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td>CREATE TABLE precipitation_full(state_code INT, station_id INT, year INT, month INT, day INT, hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1))</td>\n",
       "    </tr>\n",
       "</table>"
      ],
      "text/plain": [
       "[(u'CREATE TABLE precipitation_full(state_code INT, station_id INT, year INT, month INT, day INT, hour INT, precipitation INT, flag_1 VARCHAR(1), flag_2 VARCHAR(1))',)]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%sql SELECT sql FROM sqlite_master WHERE name = 'precipitation_full';"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Without going into full detail (yet), the above table contains one record for each hour at each station, and contains the amount of precipitation that was measured during that hour.\n",
    "\n",
    "Suppose that CS 145 has been repurposed as a rain measurement corps to assist with the department that collected this data!  Based on what we've covered so far, the above example, and the internet, create a table for storing the staff assignments.  Table requirements:\n",
    "* Everyone in the class will be holding a cup in the rain for a specific several-hour shift at a specific station; this assignment will remain the same every day\n",
    "* Each person will have one off-day per week\n",
    "* Each person's cup might be of a different size, measured as a float value\n",
    "* The Dept. of Interior data servers can't handle the full dataset we would generate, and require a random subsample- so some people will be randomly chosen to stand in the rain without a cup.  These assignments need to be recorded somehow in the table too.\n",
    "* Some people in the class have [Welsh names](https://www.youtube.com/watch?v=fHxO0UdpoxM)\n",
    "\n",
    "Type your create table statement here:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%sql\n",
    "CREATE TABLE rain_corps_assignments(\n",
    "    student_id INT PRIMARY KEY,\n",
    "    name VARCHAR(1000),\n",
    "    station_id INT,\n",
    "    state_code INT,\n",
    "    start_hour INT,\n",
    "    end_hour INT,\n",
    "    holding_cup BOOLEAN,\n",
    "    cup_size FLOAT,\n",
    "    off_day INT\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
